logic regression
Copulaboost: additive modeling with copula-based model components
Brant, Simon Boge, Haff, Ingrid Hobæk
We propose a type of generalised additive model with model components based on pair-copula constructions, with prediction as the main aim. The model components are designed such that our model may capture potentially complex interaction effects in the relationship between the response and the covariates. In addition, our model does not require discretisation of continuous covariates, and is therefore suitable for problems with many such covariates. Further, we have designed a fitting algorithm inspired by gradient boosting, as well as efficient procedures for model selection and evaluation of the model components, through constraints on the model space and approximations that speed up time-costly computations. In addition to being absolutely necessary for our model to be a realistic alternative in higher dimensions, these techniques may also be useful as a basis for designing efficient model selection algorithms for other types of copula regression models. We have explored the characteristics of our method in a simulation study, in particular comparing it to natural alternatives, such as logic regression, classic boosting models and penalised logistic regression. We have also illustrated our approach on the Wisconsin breast cancer dataset and on the Boston housing dataset. The results show that our method has a prediction performance that is either better than or comparable to the other methods, even when the proportion of discrete covariates is high.
- North America > United States > Wisconsin (0.24)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Norway (0.04)
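The Copulaboost abstract above mentions a fitting algorithm inspired by gradient boosting. As a rough illustration of that general idea only, the sketch below fits an additive model by component-wise boosting under squared-error loss. It is not the authors' algorithm: the base learners here are single-covariate linear fits rather than pair-copula components, and the function name `componentwise_l2_boost`, the shrinkage `nu`, and the simulated data are all assumptions made for this example.

```python
import random


def componentwise_l2_boost(X, y, n_rounds=200, nu=0.1):
    """Fit f(x) = intercept + sum_j beta[j] * x[j] by component-wise boosting.

    Each round fits every single-covariate linear learner (hypothetical
    stand-in for a richer model component) to the current residuals and
    adds a shrunken copy of the one that reduces squared error the most.
    """
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    beta = [0.0] * p
    resid = [yi - intercept for yi in y]
    for _ in range(n_rounds):
        best_j, best_coef, best_gain = None, 0.0, 0.0
        for j in range(p):
            sxx = sum(row[j] ** 2 for row in X)
            if sxx == 0.0:
                continue
            sxr = sum(row[j] * r for row, r in zip(X, resid))
            gain = sxr * sxr / sxx  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_coef, best_gain = j, sxr / sxx, gain
        if best_j is None:  # no component improves the fit
            break
        beta[best_j] += nu * best_coef
        resid = [r - nu * best_coef * row[best_j] for row, r in zip(X, resid)]
    return intercept, beta


# Simulated data (an assumption for this sketch): y depends on x0 and x2 only.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
y = [2.0 * x[0] - 1.0 * x[2] + rng.gauss(0.0, 0.1) for x in X]

intercept, beta = componentwise_l2_boost(X, y)
print(intercept, beta)
```

Because only one component is updated per round, covariates that never help reduce the residuals keep a coefficient near zero, which is one way a boosting-style fit performs model selection as it goes.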
Rejoinder for the discussion of the paper "A novel algorithmic approach to Bayesian Logic Regression"
Hubin, Aliaksandr, Storvik, Geir, Frommlet, Florian
We would like to begin this rejoinder by expressing our sincere gratitude to all of the discussants for their interesting and thought-provoking comments and remarks. We are also very grateful to the editorial board of Bayesian Analysis for giving us the opportunity to publish our paper entitled "A novel algorithmic approach to Bayesian logic regression" (Hubin et al., 2020a) as a discussion article. Logic regression is a tool for modelling nonlinear relationships between binary covariates and some response variable by constructing predictors as Boolean combinations. The number of possible logic expressions grows exponentially with the number of binary variables involved, making the model search significantly harder as the complexity of the Boolean combinations increases. Due to Boolean equivalence, it is in fact almost impossible to specify the full model space a priori even for a relatively small number of covariates.
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
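The rejoinder abstract above notes that the space of logic expressions grows very quickly and that Boolean equivalence makes it hard to list the model space in advance. A minimal sketch of both points follows, assuming three binary covariates and hypothetical helper names (`truth_table`, `count_boolean_functions`); it illustrates the combinatorics only and is not the search algorithm from the paper. It relies on the standard fact that there are 2^(2^n) distinct Boolean functions of n binary variables.

```python
from itertools import product


def truth_table(expr, n):
    """Evaluate a Boolean function on all 2**n inputs, as a tuple of 0/1."""
    return tuple(int(expr(*bits)) for bits in product([False, True], repeat=n))


# Two syntactically different logic expressions over covariates x1, x2, x3.
f = lambda x1, x2, x3: not (x1 or x2)         # NOT (x1 OR x2)
g = lambda x1, x2, x3: (not x1) and (not x2)  # De Morgan rewrite of f

# Boolean equivalence: different expressions, identical predictor.
equivalent = truth_table(f, 3) == truth_table(g, 3)


def count_boolean_functions(n):
    """Number of distinct Boolean functions of n binary variables."""
    return 2 ** (2 ** n)


counts = {n: count_boolean_functions(n) for n in range(1, 6)}
print(equivalent, counts)
```

Comparing truth tables is only feasible for a handful of covariates, since each table has 2^n rows, which is part of why the model space cannot simply be enumerated and deduplicated up front.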
Big Data Analysis Using Modern Statistical and Machine Learning Methods in Medicine
In this article we introduce modern statistical machine learning and bioinformatics approaches that have been used to learn statistical relationships from big data in medicine and behavioral science, which typically include clinical, genomic (and proteomic) and environmental variables. Every year, the data collected in biomedical and behavioral science become larger and more complicated. Thus, in medicine, we also need to be aware of this trend and understand the statistical tools that are available to analyze these datasets. Many statistical methods aimed at analyzing such big datasets have been introduced recently. However, given the many different types of clinical, genomic, and environmental data, it is rather uncommon to see statistical methods that combine knowledge resulting from those different data types. To this end, we will introduce big data in terms of clinical data, single nucleotide polymorphism and gene expression studies, and their interactions with the environment. In this article, we will introduce the concepts behind well-known regression analyses, such as linear and logistic regression, that have been widely used in clinical data analyses, and modern statistical models, such as Bayesian networks, that have been introduced to analyze more complicated data. We will also discuss how to represent the interactions among clinical, genomic, and environmental data using modern statistical models. We conclude this article with a promising modern statistical method, Bayesian networks, which is suitable for analyzing big data sets that consist of different types of large-scale clinical, genomic, and environmental data.
- Europe (0.76)
- North America > United States > California (0.14)
- North America > United States > Washington (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)